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ABSTRACT 

It is classical that univariate algebraic functions satisfy lin- 
ear differential equations with polynomial coefficients. Lin- 
ear recurrences follow for the coefficients of their power series 
expansions. We show that the linear differential equation of 
minimal order has coefficients whose degree is cubic in the 
degree of the function. We also show that there exists a lin- 
ear differential equation of order linear in the degree whose 
coefficients are only of quadratic degree. Furthermore, we 
prove the existence of recurrences of order and degree close 
to optimal. We study the complexity of computing these 
differential equations and recurrences. We deduce a fast al- 
gorithm for the expansion of algebraic series. 

Categories and Subject Descriptors: 1.1.2 [Symbolic 
and Algebraic Manipulation]: Algorithms 

General Terms: Algorithms, Experimentation, Theory 

Keywords: Computer algebra, algebraic series, differential 
resolvents, creative telescoping, complexity 

1. INTRODUCTION 

A power series a(X) £ ¥L[[X]] is called algebraic over the 
field K when there is a nonzero polynomial P £ ¥L[X, Y] 
such that P(X , a(X)) = 0. 

The existence of a linear differential equation with poly- 
nomial coefficients satisfied by algebraic series seems to have 
been observed first in 1827 by Abel [1, p. 287] who did not 
publish it. Cockle gave an algorithm for the computation of 
such a differential equation, that he called a differential re- 
solvent [11]. The method was then presented by Harley [15] 
and rediscovered by Tannery [21] (see also [14]). 

One of the applications of these differential equations is 
the computation of power series solutions: a linear differ- 
ential equation translates into a linear recurrence, with the 
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consequence that the number of operations required to com- 
pute the first N coefficients grows only linearly with N. This 
is useful in combinatorial applications, since the enumera- 
tion sequences of context-free languages have algebraic gen- 
erating series. This method has been described by Comtet 
[f 2] and studied from the complexity point of view by Chud- 
novsky and Chudnovsky [8]. In practice, this procedure will 
be all the more efficient as its dependency on the size of P 
is small. This in turn is related to the order and degree of 
the coefficients of the differential equation. 

We say that a linear differential equation or operator van- 
ishing at all the roots a(X) of P is associated to P. fn this 
work, we derive precise bounds for the order and degree of 
coefficients of several such equations. For brevity and clar- 
ity, the following hypothesis will be assumed throughout. 



Hypothesis (H). The characteristic of the fieldK is 0. 
The polynomial P £ K[X, Y] is separable (wrt Y), i.e., co- 
prime with its derivative Py wrt Y . The total degree of P is 
denoted D and its degrees in X and Y are written Dx and 
Dy ■ We assume that Dy > 1 ■ 

The first work on bounds that we know of in this area is 
due to Cormier et alii [13]. They consider the monic linear 
differential operator of minimal order associated to P, which 
we call the differential resolvent of P. A bound in 0(Dx Dy) 
for the degrees of its coefficients can be deduced from their 
work. Nahay [17, 18] obtained degree bounds in 0(D A ) 
for the coefficients of an analogous equation satisfied by o A 
with A transcendent over K. In §2, we prove the following 
cubic bound (dx denotes differentiation wrt X). 

Theorem 1. Under (H), the differential resolvent of P 
can be written 



„ r M, — i „ r -i . 



Mi Mo 

M r Mr' 



where r < Dy and Mo, . . . , M r in K[X] have degrees at most 

n = {{2r-l)D Y +2r 2 ~4r+3)D x -r{r-l)/2 £ 0(rD x D Y ). 

Experiments show that a cubic degree is to be expected (see 
§4.2). This theorem improves and extends the results of [13, 
17, 18]. It leads to a linear recurrence of order 0(D 3 ) with 
coefficients of degree O(D). Using fast multipoint evaluation 
on an arithmetic progression then leads to a computation of 



the first N coefficients in 0(D 2 M(D)N) arithmetic opera- 
tions (see §3.5). As usual, M(D) denotes a bound on the 
number of arithmetic operations in the ring of coefficients 
needed to multiply two polynomials of degree at most D. 
We make the standard assumption that M(di) + M(d,2) < 
M(di +d 2 ) for all di,d 2 and that DlogD G 0{M(D)). 

One reason for this cubic degree growth is the presence 
of numerous apparent singularities. These are points where 
the differential resolvent is singular (its leading coefficient 
vanishes) while none of its solutions is. However, the number 
of apparent singularities decreases dramatically if one does 
not insist on the order of the equation being minimal. It 
is actually possible to remove all apparent singularities by 
Weyl closure [22], but we do not know how to get bounds 
from this algorithm. 

In §3.2, we show that a moderate increase of the order 
leads to the following quadratic bound. 

Theorem 2. Under (H), there exists a nonzero linear 
differential operator in K[X](9x) associated to P whose de- 
gree in dx is at most 6Dy and with coefficients of degree at 
most 3DxDy ■ 

This operator then translates into a linear recurrence of or- 
der at most 3Dy(Dx + 2) with coefficients of degree at 
most 6Dy- Then, a natural question is the order of a min- 
imal linear recurrence, or equivalently the minimal degree 
in X of linear differential operators in X and Ox = Xdx 
associated to P. In §4.2, we explain why quadratic order is 
optimal. We prove in §3 the following precise bound. 

Theorem 3. Under (H), there exists a non-zero differ- 
ential operator in 'K[X\{6x) associated to P whose degrees 
in X and in Ox are bounded above by 



2D X D Y +D^ 



A + 1 with A = Dx + Dy 



D. 



Experiments showing that this bound is not far from opti- 
mal are reported in §4.2. This result implies the existence 
of a linear recurrence of order 0(D 2 ) with coefficients of 
the same degree, whence a computation of the first N co- 
efficients of series solutions in 0(M(D 2 )N), with a small 
constant in the O(-) term. In §3.4, we go a bit further: with 
a degree in X slightly larger, but still quadratic, we obtain a 
degree in Ox that is only linear; this results in an algorithm 
for algebraic series in 0(DM(D)N) operations only. 

2. DIFFERENTIAL RESOLVENT 

We write Qi, . . . ,old y for the zeroes of P(X, •) in the al- 
gebraic closure K.(X) of K.(X). Since P is separable wrt Y, 
the derivation dx uniquely extends to the splitting field 
K(X)(ai,...,a DY )oiPbyd xai = -Px(X,a t )/P Y (X, ai ). 

2.1 Cockle's Algorithm 

Cockle's algorithm [11] computes the differential resol- 
vent. It reduces the computation to linear algebra in the 
quotient ring K(X)[Y]/ (P), that is a vector space of dimen- 
sion Dy over K(X). The heart of the idea is to express the 
successive derivatives of any Qi as a polynomial of degree at 
most Dy — 1 in itself. In order to prepare the derivation of 
bounds, we introduce two sequences of polynomials. 

First, polynomials (W k ) k >i in M.[X, Y] satisfying 



are defined by induction: Wi = —Px, and for all k > 1, 
W fc+ i = (P Y dxW k - P x d Y W k )P Y - (2k - l)(P Y dxP Y - 
Pxd Y P Y )W k . They satisfy 

deg x (W k ) < (2D X - l)k - D x , 
deg Y (W k ) < 2(D Y - l)fc - D Y + 2. 

Cockle's algorithm computes a second sequence of polyno- 
mials, (Vfe)fc>o, such that V k (a) is the image of W k IP Y h ~ x 
in K(X)[Y]/{P) for k > 1, while Vo(a) = a is that of Y. 
The computation is as follows: 

1. Vi is computed by first inverting Py modulo P and 
then reducing the product with —Px modulo P; 

2. for k — 0, 1, ... , the polynomial Vfc+i is computed by 
taking the remainder of dxV k + VidyVk modulo P; 

3. the computation stops with the first integer r < Dy 
such that V r belongs to the K(X)-vector space spanned 
by Vo, ■ ■ ■ , V r -i- Then, together with r, Cockle's algo- 
rithm computes the relation 



Vr = Ar^Vr-i + ■■■ + A V , 



(1) 



and returns the unique monic differential operator of 
minimal order A 
associated to P. 



minimal order M — d x — A, — id x 1 — ■ ■ ■ — A\d x — Aq 



2.2 Degree Bound 

We are now going to prove Thm. 1 by establishing a for- 
mula for the differential resolvent in terms of the elementary 
symmetric polynomials of the on. This idea originates in 
Nahay's PhD thesis [16, Thm. 46, p. 96] (published in [18, 
Thm. 9.1]). 

Let Y\ , . . . , Yd y be new indeterminates and N the matrix: 



/ Y 1 P Y (X,Y 1 ) 2 ''- 1 
W 1 {X,Y 1 )P Y (X,Y 1 ) 2r - 

w 2 (x,y 1 )p v (x,y 1 ) 2 ''- 



V 



W r (X,Y!) 



Y r P Y (X,Y r ) 2r - 
Wi(X, Y r )P Y (X, Y T 
W 2 (X, Y r )P Y (X. Y r 



W r (X, Y r ) 



By subtracting the jth column of N to the ith one (i < j < 
r) we observe that Yi — K, divides the determinant det(Af) of 
Af, hence S := Tl 1<i<j<r (Yi - Yj) divides det(M). The op- 
erator L = det(A0/<5 thus belongs to KLY][Yi, . . . , Y r ](d x ) 
and it is symmetric in Yi, . . . ,Y r . 

By definition of the W k 's, for all a in the permutation 
group Sn Y of {1, ... , Dy} and all i € {1, . . . , r}, we have 



L(a CT (i), . . . , a CT ( r ), dx)on = 



Y[P Y {X,a„ (j) 



Wr(a 



■7(1),- 



, "cr(r), < 



<5(a CT (i), 



0. 



, Vt-<r(r)) 



where Wr(-) is the Wronskian determinant. In other words 
L(q.„(i), . . . ,a a ( r ),d x ) is associated to P for all a G Sd y - 

For all k G {0, . . . , r}, the entries of the (k + l)th row of 
Af have degree in X at most m k = (2r— l)Dx —k, except in 
the last column. Therefore, for all i 6 {0, ... , r}, the degree 
of the coefficient Li of d x in L is bounded by: 

r 

deg x (L z )< J2 m k <r(2r-l)Dx-r(r-l)/2=:m x . 

k =0,k=ii 



On the other hand, for all I G {1, . . . ,r}, the entries of the 
Zth column of Af have degree in Yi at most (2r — 1)Dy ~ 



2r + 2. Therefore, for all i G {0, . . . , r}, all j > 0, and all 
I G {1, . . . , r}, the degree of the coefficient Ltj of X 3 in Li 
is bounded by: 

degy^L^) < (2r-l)D Y -2r + 2- (r-1) =: my. 

By definition of r, there exists a permutation a G Sd y 
such that L{a a n\ , . . . , «o-( r ) , dx) has order r exactly. There- 
fore, Lemma 1 below provides us with a nonzero operator 
L G K[X][Yi, . . . , Ydy](8x) symmetric in Yl, . . . , Yo Y , such 
that L(ai, . . . , old y > dx) has order r, is associated to P, and 
degy. (L) < max(my , Dy — 1) = my, for all 2 G {1, . . . , Dy}. 

Let pi denote the coefficient of Y 1 in P seen in K[A][Y]. 
By Lemma 2 below, there exists L G K[A] [Z\, . . . , 
of total degree at most ray in Z\ , . . . , Zo Y > of partial de- 
gree in X at most mx, and such that L(ai, . . . , old y , <9x) = 
L(po/PD Y , ■ y ,PD Y -i/pD Y ,d x )- 

Finally, M = p^ L(po/pD Y , ■ ■ . ,pd y -i/pd y ,dx) is in 
KLY](9x }, has order r and degree at most mx + DxrriY in 
X, and is associated to P. The conclusion of Thm. 1 thus 
follows from the uniqueness of the minimal operator. 

The rest of the proof consists of elementary properties 
of symmetric polynomials. If r < n then the group S r of 
permutations of {l,...,r} is naturally embedded into S n - 
We write (S n /S r )i for the left quotient of S„ by S r . 

Lemma 1. Let E be a field, let r < n be nonnegative in- 
tegers, and let ai,...,a n be pairwise distinct elements in 
E. If E £ E[Yi, . . . , Y r ] is symmetric and does not vanish 
at (qi, . . . , a r ), then there exist nonnegative integers e r +i < 
r, . . . ,e n < n — 1 such that 

e= w<d, ■ • • . ^(,))y; ( ; + + 1 D • • • Y X) 

is a symmetric polynomial in E[Yi, . . . , Y n ] that does not 
vanish at (ai, . . . , a„). 



2.3 Algorithm and Complexity 

With the bounds of the previous section, we now turn 
to the computation of the differential resolvent. As in [8, 
13], the idea is to compute power series expansions of the 
coefficients and recover them by Pade approximants. For 
this, we have to choose a point where expansion is easy and 
determine a bound on the order of the series expansions. 

Definition 1. A point ogK that annihilates neither the 
leading coefficient po Y nor the discriminant Ay of P seen 
in K[A][Y], and such that Vo(a), . . . , V r -i(a) are linearly 
independent in K.[Y]/(P(a, Y)) is called a lucky point. 

If a is lucky, then the cti's can be seen as power series in 
K[[X-a}}, and the Ws as elements of K[[X-a]][Y]/(P(a, Y)). 
Here K represents the algebraic closure of K. 

Lemma 3. There exists $ G K[X] \ {0} of total degree at 
most n (as defined in Thm. 1 ) such that a is lucky whenever 
PD Y {a)A Y (a)$(a) / 0. 

Proof. For "!> we take the leading coefficient of the op- 
erator M introduced at the end of the proof of Thm. 1. If 
Pd y (a) Ay (a) 7^ then we can define A(a) as the r x Dy 
matrix {a^ (a))i,j with entries in ~K[[X — a]], B(a) as 
the Dy x r matrix in K[[X — a]] whose columns are the 
coefficients in Y of Vo,...,V r ~i, and C(a) as the invert- 
ible Vandermonde matrix of the ai. If $(a) 7^ then 
A(a) has rank r by construction, and so has B(a) since 
A(a) = C(a)B(a). □ 

Lemma 3 implies that all but a finite set of elements in K 
are lucky. A lucky point a can thus be obtained either deter- 
ministically by computing the rank of Vo(a), . . . , V r -i(a) for 
more than deg x (po y Ay$) G 0(Dx D Y ) points, or prob- 
abilistically by means of the Schwartz-Zippel zero test [23, 
Lemma 6.44]. Once a lucky point is determined (which is 
very fast by the probabilistic approach) then the computa- 
tion of the differential resolvent of P can be done efficiently. 



Proof. For fixed r, the proof proceeds by induction on n. 
If n = r, then the lemma trivially holds. We now assume 
that the lemma holds for some n > r, and prove that it 
holds for n + 1. The induction hypothesis provides us with 
nonnegative integers e r +i < r, . . . , e n < n — 1 such that 
E (as defined in the lemma) is a symmetric polynomial in 
Yi, . . . , Y n that does not vanish at (ai, . . . , ct n ). Since the 
Vandermonde matrix of ati, ■ . . , o n +i is invertible, there ex- 
ists e n +i < n such that 

J2 E(Y pm ,...,Y p(n) )Y^ + \ } 
pe(s„ +1 /s„), 

is symmetric in Yi, . . . , Y n +i and moreover does not vanish 
at («i, . • . , Q n +i). Plugging the expression of E into the 
latter sum concludes the proof for n + 1 . □ 

Lemma 2. [24, Thm 6.21] Let A be a commutative ring, 
and let E G A[Yi, . . . , Y n ] be symmetric of partial degree D in 
each of the Yi 's. Then there exists a unique polynomial F G 
A[Zi, . . . , Z n ] of total degree D such that E(Yi, . . . , Y n ) — 
F(Si, . . . , £„), where Ei represents the i-th elementary sym- 
metric polynomial in Yi, . . . , Y n . 



Proposition 1. Given a lucky point a G K and the or- 
der r of M , the computation of M can be done with 0{(r u + 
rM(Dy) + r\og(Dx))M(r/)) arithmetic operations in K. 

Here 2 < lj < 3 is the exponent of matrix multiplication, 
see, e.g., [23]. 

PROOF. First of all, shifting the variable X by a in P 
takes 0(DyM(Dx)) operations in K by means of [4, Ch. 1, 
§2] since the characteristic is zero. In order to reconstruct 
the coefficients of M from their series expansions via Pade 
approximants [23, §5.9], a precision (X — a) 2>7+1 is suffi- 
cient. The idea is to follow the steps of Cockle's algorithm 
inK[[X-o]][Y]/(P) instead of K(X)[Y]/(P). The precision 
of the series is initially set to 2n + 1 + r G 0(rf) in order to 
accommodate the precision loss due to differentiation. 

In Step 1, the computation of Vi takes 0(M(D Y )M(ri)) op- 
erations by [23, Thms. 15.10 & 15.11]. In Step 2, each V k+1 
can be obtained from Vk and Vi by means of C(M(_Dy)M(7;)) 
operations. In Step 3, we start by computing the approxi- 
mation of the relation (1) to precision (A' — a). This boils 
down to linear algebra over K in size (r + 1) x Dy, hence 
a cost in C(r" _1 Dy). From this we use Hensel-Newton 
lifting in order to recover relation (1) to precision (A — 



a) 2v+1 with 0(r u M{rj)) operations in K. The rational re- 
construction of each A* and its shift back of X by —a cost 
0(M(i7)log(fj)). □ 

When r = Dy the cost of the computation of M drops 
to 0(DxD Y +2 ) when fast multiplication is used. (Here the 
tilde indicates that the factors polynomial in log Dx , log Dy 
and their logarithms have been omitted.) Note that the 
results presented in this section extend to analyze the bit- 
complexity via p-adic approximation when K = Q. 



3. CREATIVE TELESCOPING 
FOR ALGEBRAIC FUNCTIONS 

In this section we search for differential operators associ- 
ated to P in K(X){dx) with non- minimal orders but with 
degrees in X much smaller than in the resolvent of P. 



3.1 Principle 

Let a(X) be a solution of P{X,a{X)) = 0. 
C[X, Y], then by the residue theorem, 



If P € 



a{x) = i Lf F(x ' Y)dY > with F(x ' Y) = Y p Y (x X Y) ) 

where the contour is for instance a circle containing a(X) 
and no other root of P. In this setting, creative telescoping 
can be seen as an algorithm based on differentiating under 
the integral sign and integrating by parts [2]. It consists in 
computing a linear differential operator of the form 



A = A(X, d x ) + d Y B(X, Ox, Y, dy) 



(2) 



that cancels the rational function F. There, dx and dy 
denote differentiation with respect to X and Y. If such a A 
exists, integrating A(YPy/P) = shows that A cancels a. 
We now give an algebraic proof of this property. 

Proposition 2. Under (H), suppose that an element A £ 
K[X, Y]{dx, dy) is of the form (2) and that d Y A cancels 
the rational function F — YPy/P for some integer k > 0. 
Then, the operator A is associated to P. 



Proof. The element F = YP Y /P of K(X,Y) can be 
expanded as a Laurent series in K(X,a)((Y — a)): 



Y -a 



with coefficients 7; in K(X, a). On the other hand the action 
of derivations in X and Y on a summand ^y(Y — a) 1 , for 7 
in K(X, a), is given for any i £ Z by 

dy (j(Y-a)^ = i 7 (Y -a)*' 1 

dx (^/(Y — a) 1 ^ = j'(Y — a)* — vy{Y — a) l_1 a'. 

Thus no term in (Y — a)" 1 remains after action of dy. In 
this way, applying d Y A to F and considering the resulting 
Laurent series yields 

= d Y A { F) = dyA y^-) = 0, 

which proves that A annihilates a. □ 



3.2 Quadratic Bound 

We now prove Thm. 2 by considering the degrees of the 
numerators and denominators of the successive derivatives 
of the rational function F = YPy /P. By Leibniz's rule, 
for any nonnegative integers Nx and Ng, the rjv x ,jv a := 
(N x + 1) (^V" 2 ) rational functions 

X'&xd&F, 0<i<N x , < j + k < N a 

are contained in the K-vector space of dimension at most 
s Nx ,N a := (Dy(N a + i) + i)(N x + l + Dx(N a + l)) spanned 
by the elements 

-^J-, 0<i<N x +Dx(N a + l), 0<j<D Y (N a + l). 

With the values of Nx and N a given in the theorem, the 
difference rN x ,N — SN x ,N a becomes 

(12,Dy - 7D Y - 1)D X + 12D Y + SD Y 

which is positive for positive Dy. 

This proves the existence of a nonzero differential operator 
A(X, dx, dy) with degree in X at most Nx and total degree 
in dx and dy at most N a that annihilates F. Let now k < 
N a be such that A rewrites d Y (A(X, d x ) + d Y B(X, d x ,d Y )) 
with A ^ 0. The conclusion follows from Prop. 2. 

3.3 Refined Bound 

We now prove Thm. 3. The bound of the previous section 
relies on the elimination of Y in the left ideal annihilating F 
in the Weyl algebra Wx,v(K). In the context of creative 
telescoping, this is known as Zeilberger's slow algorithm, 
which relies on holonomic D-modules [25]. A faster algo- 
rithm for hyperexponential integrands [2] consists in look- 
ing for an operator of the form (2). Indeed, letting B in (2) 
depend on Y also allows for the existence of operators of 
smaller order. In this context, the only good bounds [3] 
we know of do not apply to rational integrands. Instead, 
we give a construction inspired by one of Takayama [20, 
Thm. 2.4 and Appendix 6] showing that any differentiably 
finite function is holonomic. 

For each integer d > 0, we consider the vector space Vd of 
rational functions G/P d+1 , with polynomials G £ K[X, Y] 
satisfying 

degG< (d+l)D + d, deg A - G < (d+ l)Dx + d, 
degy G < (d+l)D Y . 

This is defined in such a way that an induction on d shows 
that the vector space Fd generated by the rational functions 



X^x-F, 0<i,j<d 



(3) 



is a subspace of Vd, and that so is dy ■ Vd-i- We are going 
to establish that for d large enough, 



dim K F d + dim K (Vd n dy ■ V d +i) > dim K V d 



(4) 



From here the result follows: both F d and Vd H dy ■ V d +i are 
subspaces of V d whose dimensions are such that they have a 
nonzero intersection. Any element of this intersection is of 
the form AF = dy(G/P d+2 ) with AF € F d . By the same 
arguments as in the proof of Prop. 2, we deduce that A is 
associated to P. 

We now consider the dimensions in (4) more precisely. 



The generators (3) are linearly independent since Py 7^ 0: 
the reduced form of 9* x ■ F has denominator P l+1 , which 
cannot be eliminated by multiplication by a polynomial in X 
only. Therefore diniK-Fd = (d + l) 2 . 

Write P in the form X Dx <f) + P where <j> depends only 
on Y. For d > 0, define the vector space Wd of ratio- 
nal functions F/P d , with polynomials F € K[X,Y] of the 
form F = cX {Dx+1)d+1 (f> d +H where c is in K and H satisfies 

degH<(D + l)d+l, deg x H < (D x + l)d, (5) 
degy H < D Y d. (6) 

The decomposition of F induces Wd Q Vi+i directly. Then, 
from 



p^Q Y ( Z_ ) = cd^X^X+^+^PdYS 



pd 



'P) 



+Pd Y H - dHd Y P, 



we obtain the inclusion dy ■ Wd C Vd. It follows that dy ■ 
Wd *~ Vd n dy ■ Vd+i, and that Eq. (4) holds as soon as 



dim K F d + dim K dy ■ Wd > dim K Vd. 



(7) 



In order to compute the dimension of dy ■ Wd, we first con- 
sider the kernel of dy\w d , which is precisely K[X] Pi Wd. 
In view of the degree constraints (5-6), this corresponds to 
numerators F - P d Q with Q € K[X] and 

deg x Q < (D x + l)d + 1 - D x d = d+l, 

so that dim K (<9 y • W d ) = dim K Wd - (d+ 2). 

The dimensions of Vd and Wd are given by the possible 
supports of the numerators of their elements. Lemma 4 
below thus provides us with: 



dim K Vd = (d+l)(D x + l)((d+l)D Y + 1) 



dim K Wd = ((Dx + l)d + 1) (Dyd + 1) 



dA + A + 1 



dA 



+ 1. 



Injecting the three dimensions in Eq. (7) finally gives the 
following condition: 



d 2 - (2D x Dy + D Y - A 2 - A - l) d 



+ (A 1)A - (Dx + l)(Dy + 1) > 0. 



Call ip the value obtained on substituting the bound of the 
theorem for d in the left-hand side: 

ip = (3Dy - l)Dx + Dy~ 3A(A + l)/2 + 2. 

From D Y > 1 and < A < mm(D x ,D Y ) =: m > 1 it 
follows that 

■4> > (3m-l)m+m-3m(m+l)/2+2 = 3m(m-l)/2+2 > 0. 



3.4 Construction of Differential Operators 

Given a polynomial P £ K[X, Y] satisfying (H), we have 
proved in the previous sections the existence of several oper- 
ators in K[X](9x) associated to P. The aim of this section 
is to study the efficient construction of such operators. This 
is done under an additional, but mild, hypothesis ((FT) be- 
low). As a by-product, we obtain in Thm. 4 an alternative 
unified proof of slightly weaker versions of Thms. 1, 2 and 3. 

Direct Algorithms. The proof of Thm. 2 boils down to a 
kernel computation of a matrix of size ~ DxDy. Even using 
FFT for (univariate and bivariate) polynomial multiplica- 
tion and a Wiedemann-like approach for the linear algebra 
step, this computation requires ~D X D Y operations in K. 

In the sequel we provide an algorithm with better cost, 
which is based on Pade-Hermite approximation of series ex- 
pansions of an algebraic solution and its derivatives. 

Notation and Assumptions. We assume in the rest of this 
section that the following hypothesis holds. 

Hypothesis (H'). Besides Hypothesis (H), 

(H a ) Dy > 2 and P is absolutely irreducible in K[X, Y] 
(which means that P is irreducible when seen over the 
algebraic closure K ofK), and 

(H a ) x = annihilates neither the discriminant Ay(X) nor 
the leading coefficient Pd y (X) of P, seen in K[X][Y]. 

To clarify the status of these assumptions, note that (H a ) 
is a global property of P which is crucial in the subsequent 
proof that our algorithm works correctly. In contrast, the lo- 
cal property (Ht) is purely of technical nature, being similar 
with that on lucky points in Section 2.3. (Ht) can be ensured 
deterministically and in good complexity after a change of 
coordinates X <— X + a with a G K not a zero of Pn Y Ay. 

Principle of the Algorithm. Under (H b ), the polynomial 
p(Y) := P(0,Y) is separable of degree Dy. By Hensel's 
lemma, the Dy roots an of P(X,cti) = are all in K[[X]]. 
To avoid the factorization of p(Y), we work in the quotient 
algebra A := K[Y]/(p(Y)). Let y denote the residue class 
of Y in A. By Hensel's lemma again, there exists a unique 
series tp G A[[X]] such that ip(0) = y and P(X,tp(X)) = 0. 
Moreover, the expansion of tp to any prescribed precision 
can be efficiently computed by Newton iteration. 

To give a unified presentation, we denote by d one of 
the derivation operators dx and 6x- The principle of our 
algorithm is to compute enough terms of <p(X), so that the 
coefficients of an associated operator can be recovered by 
Pade-Hermite (abbrev. PH) approximation of the vector 
((p,dip,d 2 ip, . . .)* of the first derivatives of tp wrt d. Thus, 
we obtain first an operator with coefficients in A[X], from 
which we extract an operator in K[X]{d) associated to P. 
We give in Figure I a detailed presentation of the algorithm. 
The main difficulty of this approach is to provide suitable 
bounds on the orders of truncation. This is done in the next 
theorem, which encapsulates the main results of this section. 



Lemma 4. The number of monomials in X and Y of total 
degree at most 5, degree in X at most 8x , and degree in Y 
at most S Y is (8 X + l)(5y + 1) - whenever 
Sx + Sy + 1 > S > m&x(Sx ,Sy). 



Theorem 4. Let P £ K[X,Y] be of bi-degree (Dx, Dy) 
and satisfy Hypothesis (H 1 ). The number of ops in K for 



1. AlgToDiff (P,4:DxDY,DY,d) is 6(D X D 



A\gToD\f((P,Bx,B a ,d) 

Input: PeK[X,Y], B x ,B a G N, d G {dx,O x } 

Output: an operator L G K[X](<9}\{0} associated to P, 
with deg x (L) < Bx and deg e (L) < Bg. 

1. Set E := B x B a + B x + B a . 

2. Expand <p{X) G A[[X]] up to precision E + Bg. 

3. Compute Zi = d l <p, for i = 0, 1, . . . , Ba to precision E. 

4. Compute a PH approximant (lo(X), . . . ,£ Bg (X)) ^ 
in A[X] B ° + 1 of (Z ,...,Z Bg ), of type (B X ,...,B X ). 

5. Write L = ^o(X)+^i(X)9 + ---+^i3 a (X)a So as 
Lo + yL- L + --- + y DY ~ 1 L DY - 1 , with L, G K[X]<5). 

6. Return any Li 7^ 0. 



Figure 1: Construction of differential operators by 
Pade-Hermite approximation. 

2. AlgToDiff(P,5Dxi3y,5Dy,9) isd(D x D Y + 2 ); 

3. AlgToDiff(P, B, B, d) with B = 4D x Dy+D y -2D x -2 
is6(D^ +1 D Y + 2 ). 

All three output a non-zero differential operator in K[X](d) 
associated to P. 

The first algorithm returns an operator of order at most Dy, 
but with coefficients of cubic degree. The second one returns 
an operator whose order is still linear in Dy, but whose 
coefficients have only quadratic degree. Finally, the last one 
returns an operator of quadratic order and degree, whose 
degree is smaller than the previous one. 

First we give a bound on the number of terms needed to 
reconstruct an operator. 

Lemma 5. Under (H') f let L G K[X](d) \ {0} be of order 
at most Bg in d, with coefficients of degrees at most Bx ■ 

(a) If ao G is such that P(X, ao) = and Lao = 0, 
then L is associated to P. 

(b) Leta = 4DxD Y Bg + BxDY-2DxBg. Ifa eK[[X]] 
is such that P(X,ao) and Lao are zero modulo X" , 
then L is associated to P. 

Proof. There exists Q(X, Y) G K[X, Y] with 

deg x (Q) < 2BgD x + B x - D x 
degy(Q) < 2(D Y - l)Bg -D Y +2, 

such that La — p ^} X ^2B a -i f° r anv ro °t a G K(X) of 
P(X,a) =0. Indeed, if L = £ (X) + ■ ■ ■ + £ Bg (X)d Ba , we 
take Q = loYPy 3 ^ 1 + £ 1 WiPy Bo ~ 2 + ■ ■ -+£ Ba W Ba , where 
the Wi(X,Y) are the polynomials introduced in Section 2. 

Let R(X) G K[X] be the resultant of P(X, Y) and Q{X, Y) 
with respect to 7. To show that L is associated to P, it is 
enough to prove that R is identically zero. Indeed, since 
P is irreducible in K[X, Y] , the nullity of R implies that P 
divides Q [23, Cor. 6.20], therefore Q(X, a) = 0, and thus 
La = 0, for any a G K(X) such that P(X, a) = 0. 

(a) By definition, there exist polynomials U, V G WL[X, Y] 
such that U(X,Y)P(X,Y)+V(X,Y)Q{X,Y) = R{X). Eval- 
uating this equality at Y = ao yields R(X) = 0, as wanted. 



The same reasoning shows that under the hypothesis of (b) 
we have R(X) — mod X" . On the other hand, by Bezout's 
theorem, one can bound the degree of the resultant R by 
Dx degy Q + Dy deg x Q. Therefore deg(ii) is at most 

Dx(2D Y Bg - 2Bg -D Y + 2) + D Y (2B a Dx +B X -D X ) 
< 4D x D Y Bg + B X D Y ~ 2DxB a = o, 

since D Y > 2. Finally deg(i?) < a and R(X) = mod X a 
imply that R is the zero polynomial. □ 

PROOF of Thm. 4. In Step 4, the Pade-Hermite equality 
liZi =0 mod X s can be viewed as an under-determined 
homogeneous linear system with E equations in the E + 1 
unknown coefficients of the polynomials ii. It thus admits 
at least one non-trivial solution in A s+1 . 

Next let p — YliPi represent the irreducible factoriza- 
tion of p in K[Y], and let A; = K[Y]/(pi(Y)), so that A 
rewrites into the product of fields ni-^- Let ip^' be the 
image of tp in Ai[[X]] C K[[X]] and let L^ be the image 
of L in Ai[X]{d) C K[X]{d). By construction P{X,ip (i) ) = 
L (i) ip {i) = mod X s holds in K[[X]. Since all choices of 
(Bx,Bg) as specified in Thm 4 ensure E > a, it follows 
from Lemma 5(b) that L^ is associated to P for all i. In 
other words, for any root a G K of p(a) — 0, and any root 

a G K[[X]] of P(X,a) = we have that L a + aLia^ h 

aP Y ^ x Lo Y -\a = 0. We thus deduce that Li is associated 
to P for all i, which concludes the proof of correctness. 

We are left with the detailed description and complex- 
ity study of each step of the algorithm. Operations (+, x) 
in A have cost 0(M(Dy)) operations in K, while inversion 
(when possible) takes 0(M(Dy ) log Dy). For simplicity, in 
what follows we suppose that FFT is used for polynomial 
multiplication, thus M(d) G 0(d) for all d. 

For the lifting step 2, we use the Newton iteration. By [7, 
Algo. 2, Prop. 3], under the hypothesis (H'), the first E terms 
of ip G A[[X]] cost 0(ED Y (UJ+1)/2 + x/S7M(E)M(D y )) G 
6(B x BgD ( y +1)/2 ) ops in K. 

Since differentiating has linear complexity, Step 3 requires 
O(BgBx) ops in A, hence at most <D(BgBx Dy) ops in K. 

For Step 5, we use the approximation algorithm in [19], 
which computes a nonzero Pade-Hermite approximant of 
type (Bx, Bx) for (Z , ...,Z Bg ) within O(B^Bx) ops 
in A, provided that A is a field, or more generally, if no 
division by a non-invertible element in A occurs. 

In the general case, since A is not a field, we use the fol- 
lowing strategy. We begin by applying the approximation 
algorithm over A. If no forbidden division occurs, this al- 
gorithm outputs the desired operator L with coefficients in 
A[X]. If we encounter a nonzero q G A that is not invertible, 
then by a gcd computation we obtain a proper factor p of 
p of degree at most Dy/2 such that q is either invertible 
or zero modulo p. This takes O(Dy) operations in K [23, 
Th. 11.5]. We replace A by K[Y]/(p(Y)) and restart the ap- 
proximation algorithm. This procedure is repeated at most 
log Dy times, until all the divisions are doable. The total 
cost of this computation is 0(Bg Bx Dy) operations in K. 

Finally, Steps 1, 5 and 6 do not use any arithmetic opera- 
tion. Adding together these costs, and under the assumption 
Dy < Bg (satisfied by the choices in Thm. 4), we get that 
AlgToDiff(P, B x ,Bg, d) uses 6{B%B X D Y ) ops in K. □ 



Probabilistic and Heuristic Versions. We now give a 
probabilistic version of Algorithm AlgToDiff, called AlgToD- 
iffP. The idea is to compute a Pade-Hermite approximant 
over K rather than over A. The probabilistic algorithm Al- 
gToDiffP is obtained by replacing Steps 3-6 in AlgToDiff by: 
3'. Let <p(X) = MX) + ypi(X) + ■■■+ j/ Cy -Vcy-iW- 
Choose random ao, oi, . . . , cld y -± £l \ {0} and compute 
Z = Yji<H<Pi{X) mod X s , Zi =d'Z ,i = l,...,B g . 
4'. Compute a PH approximant (lo(X), . . . ,iB g (X)j in 
K[X] Ba+1 of (Z ,Z 1 ,...,Z Bg ) of type {B X ,...,B X ). 
5'. Return L = £ (X) + £ 1 (X)d + ■ ■ ■ + £ Ba (X)d Ba . 

MgToD\ffP(P,5D x D Y ,5D Y ,d) uses 6(D x D ( y +5)/2 ) ops 
in K; with B as in Thm. 4, the cost of AlgToDiffP(P, B, B, d) 

is 0(D 2 x D { y +5)/2 + (D x DyY +1 ). The exponent of D Y is 
better than for the deterministic algorithms. 

Another improvement comes from the remark that the 
algorithms derived from AlgToDiff and AlgToDiffP do not 
return operators of order and degree as small as predicted 
by Thms. 2 and 3. The reason for this is technical: we need 
the values B x , Bg it uses in order to prove that the opera- 
tor constructed by Pade-Hermite approximation cancels <p 
and not merely its truncation. Replacing arguments B x , Bg 
by the better bounds from Thms. 2 and 3 we obtain heuris- 
tic modifications AlgToDiffP(P, 3D X D Y +6D Y , 6D Y , 8) and 
AlgToDiffP(P, B, B, d) with B as in Thm. 3. So far, we do 
not have a proof that these heuristics always return a correct 
output, but they behave very well in practice (see §4). 

3.5 Unrolling the Recurrence 

The operators from Thm. 4 let compute series expansions 
of algebraic functions asymptotically faster than Newton it- 
eration. The following quantifies the improvement. 

Theorem 5. Let %p e h[[X]] be such that P(X,ip(X)) = 
0, where L = K(a) is a finite extension of K C C. Then, 
under (H'), the first N terms of if) can be computed in 
O (ND x D Y ([h : K] + U(D Y )/D Y )) ops in K, as N -> oo. 

In the case when L = K, the complexity we get is therefore 
linear in N and quadratic in the degree if fast multiplication 
is used. A better complexity O(ND) was announced by 
Chudnovsky & Chudnovsky in [8, 9]. While the dependency 
in N is correct, they seem to have overlooked the degree 
growth of the coefficients of the differential resolvent, that 
gives a cubic dependency in D for their algorithm. 
Our algorithm is presented in the proof below. 

Proof. The first part of the computation is indepen- 
dent of N. We begin by calling AlgToDiff(P, B x , Bg , d) with 
B x = 5D X D Y , Bg — 5D Y and d = 8 x . We then convert 
the operator into a recurrence of the form 

r (n)u n + n(n)u n+ i H h r Bx (n)u n+Bx = 0, for n > 

with coefficients ri(X) £ KLY] of degree at most Bg. Let p 
be the largest nonnegative integer root of the polynomial 
r Bx (X), or —1 if no such root exists. Thus for any n > p, 
u n + Bx can be computed from the previous ones using the 
recurrence. Note that we also have at our disposal the initial 
segment ~^2^ ' 1 u„X" of <p{X). For practical purposes p 
can be replaced by an upper bound like a Cauchy bound 
(see, e.g. [24, Lemma 6.7]). If p / —1, we use Newton 
iteration to extend this initial segment up to the index p + 
B x if necessary. Using the minimal polynomial of a, we 
then deduce the corresponding segment of tp. 



Only the rest of the computation depends on N. The 
recurrence is unrolled for n from p+ 1 to TV — B x — 1. Since 
the coefficients rj belong to K, so do the values r^(n) and 
thus each new coefficient u n of ip can be obtained using 
0{B x [TL : K]) <= 0{D x D Y [h : K]) operations in K, provided 
that the values taken by the coefficients r;(X) at the points 
p + 1, . . . , N — B x — 1 have been computed and stored. 

This computation can be done separately for each coef- 
ficient ri(X) using fast multipoint evaluation on p+l,p + 
2, . . . , p + Bg and then fast extrapolation on arithmetic se- 
quences. The cost of all the fast multipoint evaluation steps 
is 0(B x M(Bg) log Bg) and that of all the fast extrapolation 
steps is 0{B x NM(Bg)/Bg), see e.g. [23, 6]. □ 

We note that Hypothesis (H') is not strictly necessary in 
practice: the same method handles singular cases as well, 
provided power series solutions exist at the point of interest. 

4. EXPERIMENTS 

4.1 Timings 

We have implemented the algorithm of §2.3 in Magma V2.ll- 
14 [5]. In Table 1 we report on our experiments with ran- 
dom dense bivariate polynomials over the prime field with 
9973 elements on a Pentium (M) 1.8GHz processor. Column 
"ser." indicates the time (in sec.) of our algorithm, while 
column "rat." is for Magma's Dif f erentialOperator, where 
we removed the characteristic requirement. The latter rou- 
tine implements Cockle's algorithm in K(A) and has been 
contributed by A. van der Waall. Column "77" indicates the 
value of r\ and column "deg x (M)" gives the actual degree 
in X of the minimal operator. We observe that Cockle's 
algorithm is always faster when run over the series and that 
there is still room for improvement on our degree bound 77. 

In Table 2, we compare the practical behaviour of our 
heuristic algorithm from §3.4 for algebraic series defined by 
polynomials of bi-degree (1, D Y ) with that of Newton's iter- 
ation (column Newton). Column AlgToR.ec contains the 
cumulated timings of the heuristic AlgToDiffP(P, B, B, d) 
(with B as in Th. 3) and of the conversion of a differen- 
tial operator into a recurrence. Column gives the number 
of terms of the expansion, while column unroll shows the 
timings for unrolling the recurrence. One motivation for the 
computation when D x — 1 is its application in univariate 
polynomial root finding by homotopy in [10, § 10]. 

These experiments show that there is an initial cost due 
to the size of the recurrence being constructed. Once the 
number of desired coefficients is high enough compared to 
this size, then the method is very effective. 

4.2 On the Optimality of the Bounds 

Conjectures. Intensive experiments suggest that the de- 
gree of the differential resolvent is bounded by D(D 2 — 
5D/2 + 5/2) for polynomials of total degree D and D(2D 2 - 
3D + 3) for polynomials of bi-degree (D,D), within a fac- 
tor 2 of the bound of Thm. 1. Also, the bound from Thm. 3 
seems only slightly pessimistic: for a polynomial of total 
degree D, there often exists a recurrence of order D 2 — 2 
with coefficients of degree at most D 2 — 1. For a polyno- 
mial of bi-degree (D x , D Y ), experiments suggest the bound 
2D x D Y ~2- {D x - D Y ) if D Y > 1 and D x + 1 if D Y = 1. 



Table 1: Cockle's 


algorith 


m 


(D X ,D Y ) 


ser. 


rat. 


r, deg x {M) 


(1,1) 


0.002 


0.002 


2 


2 


(2,2) 


0.003 


0.004 


17 


10 


(3,3) 


0.02 


0.03 


69 


36 


(4,4) 


0.10 


0.16 


182 


92 


(5,5) 


0.47 


0.98 


380 


190 


(6,6) 


1.86 


4.56 


687 


342 


(7,7) 


5.80 


16.5 


1127 


560 


(8,8) 


15.5 


49.9 


1724 


856 


(9,9) 


38.0 


138 


2502 


1242 


(10,10) 


72.7 


340 


3485 


1730 


rable 2: Newton's 


algorith 


m vs. AlgToDifTP 


(D X ,D Y ) 


N AlgToRec 


unroll 


Newton 


(1,8) 


2 ±u 


6.66 


0.22 


1.75 


(1,8) 


oil 


6.66 


0.44 


3.91 


(1,8) 


2 12 


6.66 


0.89 


8.83 


(1,8) 


2 13 


6.66 


1.8 


18.84 


(1,8) 


2 14 


6.66 


3.67 


42.53 


(1,8) 


2 15 


6.66 


7.38 


94.99 


(1,9) 


2 iu 


10.54 


0.26 


2.14 


(1,9) 


2 11 


10.54 


0.44 


4.35 


(1,9) 


2 12 


10.54 


1.07 


9.31 


(1,9) 


2 13 


10.54 


2.17 


20.64 


(1,9) 


2 14 


10.54 


4.36 


43.44 


(1,9) 


2 15 


10.54 


8.75 


97.64 



Lower Bound. We note also that D(D — 1) is the degree of 
the discriminant of a generic polynomial P of total degree D. 
Generically, this discriminant is square-free. The differential 
equation satisfied by all solutions of P has to be singular 
at the roots of the discriminant and therefore its leading 
coefficient cannot have a smaller degree. Thus our bound 
on the order of the recurrence is asymptotically optimal. 

An example illustrating D(D — 1) as a lower bound is 
provided by P(X, Y) = Y D - Y + X D . Then a — X D + 

X° 2 + 0(X d2 ) is a root of P seen in Q[[X]][Y]. It has 
a sequence of D(D — 1) — 1 zero coefficients before a non- 
zero one. Thus, the order of the recurrence satisfied by the 
coefficients of a is at least D(D — 1). 
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